Skip to content

BOOKKEEPER-968 Entry log flushes at configurable chunks#77

Closed
dlg99 wants to merge 1 commit into
apache:masterfrom
dlg99:task/entry-log-flush
Closed

BOOKKEEPER-968 Entry log flushes at configurable chunks#77
dlg99 wants to merge 1 commit into
apache:masterfrom
dlg99:task/entry-log-flush

Conversation

@dlg99
Copy link
Copy Markdown
Contributor

@dlg99 dlg99 commented Nov 11, 2016

With this patch one can configure interval (in bytes) for entry log to flush writes to the disk.

@dlg99
Copy link
Copy Markdown
Contributor Author

dlg99 commented Nov 11, 2016

Failed test seems to be unrelated/flapper

@merlimat
Copy link
Copy Markdown
Contributor

@dlg99 Have you considered the approach of using a rate limiter (eg: guava RateLimiter) with a configurable max number of MBytes/s to write when flushing?
It might be a bit inconvenient to properly tune for one's system, but setting the rate slightly below the disk write capacity should do the trick.

@merlimat
Copy link
Copy Markdown
Contributor

On the other hand, we might even be able to do some auto-tuning: If I was only able to write let's say 200 MB in 1 second without the throttling. Next time I'll try to throttle at 190MB/s to not saturate the disk.

@dlg99
Copy link
Copy Markdown
Contributor Author

dlg99 commented Nov 12, 2016

@merlimat limiting write rate won't help. One can write with relatively low write rate but since entry log only gets fileChannel.force() on log rotation we end up with a lot of writes cached in memory.
Once OS (when cache is too full or because of explicit fileChannel.force()) decide that it is time to flush the data to disk we have no control over the rate of writes. It does it at max possible speed, flushing 2048M on ssd with 400M/sec write throughput takes 5 sec if there are no other writes/reads.
I.e. in our production env I have configured flush to happen at every 10MB of entry log writes and so far I am seeing a lot less read latency spikes. I'll get some charts to show during meetup.

One can experiment with linux config (dirty_write_bytes / backround_dirty_write_bytes IIRC) but these are OS-wide setting (not per disk) and will affect other writes, i.e. I would not want to decrease these parameters to 10-50M for the rotational disk where we write application logs. Limiting these to the range of hundreds MB does not help much with the specific problem on hands.

@jiazhai
Copy link
Copy Markdown
Member

jiazhai commented Nov 16, 2016

@sijie , should we add items in doc/bookkeeperConfigParams.textile along with this changes?

In this link:
https://cwiki.apache.org/confluence/display/BOOKKEEPER/Contributing+to+BookKeeper
it said:
New server configuration settings should be added and documented in bookkeeper-server/conf/bk_server.conf, and also documented in doc/bookkeeperConfigParams.textile.

But seems, this textile file contains only the tip of the iceberg, a lot of config parameter is not added in. If needed, we may need a ticket to fix this.

Copy link
Copy Markdown
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good to me.

@sijie
Copy link
Copy Markdown
Member

sijie commented Nov 21, 2016

+1 on what @jiazhai suggested. we should document all the new configurations.

@jvrao
Copy link
Copy Markdown
Contributor

jvrao commented Nov 21, 2016

+1 on the patch.
@sijie we may need a jira item for what @jiazhai suggested.

@sijie
Copy link
Copy Markdown
Member

sijie commented Nov 21, 2016

@jvrao yup. another jira is fine. We need some jiras for documenting the settings introduced in 4.5.0 and mark them as blockers for 4.5.0 release.

@sijie
Copy link
Copy Markdown
Member

sijie commented Nov 21, 2016

+1 for this change

@asfgit asfgit closed this in 26b09ab Jan 31, 2017
sijie pushed a commit to sijie/bookkeeper that referenced this pull request Jan 26, 2018
* changed all instances of `assert()` to junit versions in `src/test`

Here is the script I used to find all instances of `assert()` inside the `src/test` folder:

```
grep -r "assert(" * 2>/dev/null | grep -v "main"
```

The `grep -v "main"` removes all instances of the usage within the main source tree (which there are quite a few). I did this as I assumed the JIRA ticket spirit was not to remove those instances from the main tree and thus would require including `junit` in core compilation rather than scoped for `test` as is.

Author: Brennon York <brennon.york@capitalone.com>

Reviewers: Sijie Guo <sijie@apache.org>

Closes apache#77 from brennonyork/DL-122
reddycharan pushed a commit to reddycharan/bookkeeper that referenced this pull request May 21, 2018
)

* (@bug W-4698540@) Increase openLedgerRereplicationGracePeriod

openLedgerRereplicationGracePeriod defines how long the replication
worker waits to fence an active extent before starting replication.
The default value is 30 sec. Which should be enough for the writer
to detect the bookie down incident and make an ensemble change.
But given that we don't have back-pressure, sometimes client and
bookies get backed up and may take longer time to detect and change
the ensemble. Changing this value to 10 mins.

Making this longer should not have any major impact as the writer is
confirmed to everything it wrote. Also now we have explicitLAC so
even delaying the fencing won't stop/restrict reader reading even
the last entry/fragment

Signed-off-by: Venkateswararao Jujjuri (JV) <vjujjuri@salesforce.com>

* Applied review comments
@dlg99 dlg99 deleted the task/entry-log-flush branch October 14, 2021 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants